35 research outputs found
Recommended from our members
The Blue Obelisk Community
Poster presented at the VSMF symposium held at the Unilever Centre on 2011-01-17. The Internet has brought together a group of chemists who are driven by wanting to do things better, but are frustrated with the Closed systems that chemists currently have to work with. They share a belief in the concepts of Open Data, Open Standards and Open Source, and they express this in software, data, algorithms, specifications, tutorials, demonstrations, articles and anything else that helps get the message across. [http://www.blueobelisk.org/]
Surface studies and density functional theory analysis of ruthenium polypyridyl complexes
In recent years, the computational method Density Functional Theory (DFT) has become increasingly important as an effective tool for studying inorganic complexes. This thesis describes computational studies on ruthenium polypyridyl-type complexes using DFT. An introduction to the theory behind DFT is presented in the first chapter, as well as a review of previous DFT studies on ruthenium polypyridyl-type complexes.
The second chapter describes the details of the computational studies. This includes a description of the basis set, functional, and integration grid. This chapter also describes two pieces of in-house software: GaussSum, written to process the output of the computational package Gaussian, and GauStock, which is used to calculate Hirshfeld atomic charges.
Chapter 3 examines the electronic structure of a series of complexes related to [Ru(bpy)2(pytrz)]+. The calculated electronic structure is compared with results from experiment. Partial density of states (PDOS) spectra are used to visualise the results. Linkage isomerism and methylation reactions are examined using thermodynamics and, in the case of methylation reactions, also with kinetics.
Chapter 4 compares the electronic structure of dinuclear complexes with their corresponding mononuclear analogues. PDOS spectra are used to highlight the changes that occur on addition of a second metal centre.
The quality of the predicted Raman frequencies of [Ru(bpy)3]2+ is the focus of Chapter 5. The effect of basis set size, grid size and the inclusion of solvent effects is discussed. The results are compared with the experimental values.
Chapter 6 presents the first DFT study of an osmium complex attached to a surface, in this case, to a gold (111) surface. A cluster model is used for the surface. The effect of adsorption on the energy levels of the complex is studied. The effect of oxidation on the adsorbate-substrate bond is also examined.
Chapter 7 is an overview of the information available from DFT calculations. DFT is a very useful tool for examining the electronic structure of ruthenium polypyridyl complexes. PDOS spectra highlight changes in electronic structure between related complexes. Trends in oxidation potential are reproduced by the position of the metal PDOS peak. Predicted Raman frequencies agree well with experiment, although a scaling factor is required. Adsorption of an osmium complex on a gold surface causes molecular orbitals close to the surface to shift, while the relative positions of other molecular orbitals remain unchanged. The oxidised complex binds more strongly, due to the change in the nature of the frontier orbitals.
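The remark that predicted Raman frequencies need a scaling factor can be illustrated with a minimal sketch. It assumes a single uniform factor fitted by least squares, which is a common convention but is not stated in the abstract; the function names and the wavenumbers are hypothetical placeholders, not values from the thesis.

```python
def least_squares_scale(calculated, experimental):
    """Return the uniform factor c that minimises sum((c*calc - exp)**2)."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("need two non-empty sequences of equal length")
    numerator = sum(c * e for c, e in zip(calculated, experimental))
    denominator = sum(c * c for c in calculated)
    return numerator / denominator


def rms_error(calculated, experimental, scale=1.0):
    """Root-mean-square deviation between scaled calculated and experimental values."""
    n = len(calculated)
    total = sum((scale * c - e) ** 2 for c, e in zip(calculated, experimental))
    return (total / n) ** 0.5


# Placeholder wavenumbers in cm^-1 (hypothetical, NOT taken from the thesis).
calc = [1050.0, 1210.0, 1345.0, 1520.0, 1640.0]
expt = [1028.0, 1176.0, 1320.0, 1491.0, 1608.0]

scale = least_squares_scale(calc, expt)
print(f"scale factor: {scale:.4f}")
print(f"RMS error unscaled: {rms_error(calc, expt):.1f} cm^-1")
print(f"RMS error scaled:   {rms_error(calc, expt, scale):.1f} cm^-1")
```

Because the fitted factor minimises the squared error, the scaled RMS error can never exceed the unscaled one for the same data.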
Computational design and selection of optimal organic photovoltaic materials
Conjugated organic polymers are key building blocks of low-cost photovoltaic materials. We have examined over 90 000 copolymers using computational predictions to solve the "inverse design" of molecular structures with optimum properties for highly efficient solar cells (specifically matching optical excitation energies and excited-state energies). Our approach, which uses a genetic algorithm to search the space of synthetically accessible copolymers of six or eight monomer units, yields hundreds of candidate copolymers with predicted efficiencies over 8% (the current experimental record), including many predicted to be over 10% efficient. We discuss trends in polymer sequences and in the most frequent monomers and dimers found in these highly efficient targets, and derive design rules for the selection of appropriate donor and acceptor molecules. We show how additional computationally intensive filtering steps can be used, for example, to eliminate targets likely to have poor hole mobilities. Our method targets optimum electronic structure and optical properties far more efficiently than time-consuming serial experiments or computational studies, and can be applied to similar problems in other areas of materials science.
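The genetic-algorithm search over fixed-length monomer sequences can be sketched as follows. This is a toy illustration only: the monomer library, the numeric property attached to each monomer, the target value and the fitness function are all hypothetical stand-ins, whereas the study itself scores candidates with computed excitation and excited-state energies.

```python
import random

# Hypothetical monomer library: label -> made-up numeric property.
MONOMERS = {"A": 0.2, "B": 0.5, "C": 0.9, "D": 1.4, "E": 1.8}
TARGET = 1.1   # hypothetical target for the mean property of a sequence
LENGTH = 6     # the abstract mentions sequences of six or eight monomer units


def score(seq):
    """Placeholder fitness: 0 is best, more negative is further from TARGET."""
    mean = sum(MONOMERS[m] for m in seq) / len(seq)
    return -abs(mean - TARGET)


def mutate(seq, rng):
    """Replace one randomly chosen position with a random monomer."""
    pos = rng.randrange(len(seq))
    new = list(seq)
    new[pos] = rng.choice(sorted(MONOMERS))
    return tuple(new)


def crossover(a, b, rng):
    """Single-point crossover of two parent sequences."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def run_ga(generations=40, pop_size=20, seed=0):
    rng = random.Random(seed)  # seeded so the run is reproducible
    names = sorted(MONOMERS)
    pop = [tuple(rng.choice(names) for _ in range(LENGTH)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = parents + children
    return max(pop, key=score)


best = run_ga()
print("".join(best), round(score(best), 3))
```

In a realistic setting the expensive step is the scoring call, so cheap filters would typically run inside the loop and costlier ones (such as the hole-mobility filter the abstract mentions) only on the surviving candidates.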
Pybel: a Python wrapper for the OpenBabel cheminformatics toolkit.
BACKGROUND: Scripting languages such as Python are ideally suited to common programming tasks in cheminformatics such as data analysis and parsing information from files. However, for reasons of efficiency, cheminformatics toolkits such as the OpenBabel toolkit are often implemented in compiled languages such as C++. We describe Pybel, a Python module that provides access to the OpenBabel toolkit. RESULTS: Pybel wraps the direct toolkit bindings to simplify common tasks such as reading and writing molecular files and calculating fingerprints. Extensive use is made of Python iterators to simplify loops such as that over all the molecules in a file. A Pybel Molecule can be easily interconverted to an OpenBabel OBMol to access those methods or attributes not wrapped by Pybel. CONCLUSION: Pybel allows cheminformaticians to rapidly develop Python scripts that manipulate chemical information. It is open source, available cross-platform, and offers the power of the OpenBabel toolkit to Python programmers.
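The idea of using a Python iterator to loop over all the molecules in a file can be illustrated without any toolkit. The sketch below is not Pybel's implementation and does not call OpenBabel; it is a plain-Python generator, with the hypothetical name `iter_sd_records`, that splits SD-file text on the `$$$$` record delimiter. The records in the example are stubs rather than valid molfiles.

```python
import io


def iter_sd_records(handle):
    """Yield one SD-file record at a time as a list of lines."""
    record = []
    for line in handle:
        line = line.rstrip("\n")
        if line.strip() == "$$$$":      # SD-file record delimiter
            yield record
            record = []
        else:
            record.append(line)
    if any(item.strip() for item in record):   # tolerate a missing final delimiter
        yield record


# In-memory stand-in for a file on disk; the record bodies are stubs.
fake_sdf = io.StringIO("ethanol\n(stub)\n$$$$\nbenzene\n(stub)\n$$$$\n")
titles = [rec[0] for rec in iter_sd_records(fake_sdf)]
print(titles)
```

Because the generator yields records lazily, the caller can write a simple `for` loop over a very large file without holding every molecule in memory at once.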
Simultaneous feature selection and parameter optimisation using an artificial ant colony: case study of melting point prediction.
BACKGROUND: We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024-1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581-590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. RESULTS: Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6 degrees C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, epsilon of 0.21) and an RMSE of 45.1 degrees C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3 degrees C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5 degrees C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. CONCLUSION: With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. 
The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors.
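The three quantities reported for each model (RMSE, R2, and the slope of the line through the residuals) can be computed as in the sketch below. It rests on assumptions the abstract does not spell out: R2 is taken as the coefficient of determination, residuals are defined as predicted minus observed, and the slope comes from an ordinary least-squares fit of residuals against observed values. The melting points are made-up numbers, not data from the study.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R2)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def residual_slope(observed, predicted):
    """OLS slope of residuals (predicted - observed) regressed on observed."""
    residuals = [p - o for o, p in zip(observed, predicted)]
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(residuals) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, residuals))
    sxx = sum((x - mean_x) ** 2 for x in observed)
    return sxy / sxx


# Made-up melting points in degrees C (hypothetical, NOT from the dataset).
observed = [50.0, 100.0, 150.0, 200.0, 250.0]
predicted = [80.0, 110.0, 150.0, 185.0, 225.0]

print(f"RMSE  = {rmse(observed, predicted):.2f} degrees C")
print(f"R2    = {r_squared(observed, predicted):.3f}")
print(f"slope = {residual_slope(observed, predicted):.2f}")
```

Under these conventions a negative slope means the model overpredicts low melting points and underpredicts high ones, so a slope closer to zero indicates less bias at the extremes of the range.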
Elucidating excited state electronic structure and intercomponent interactions in multicomponent and supramolecular systems
Rational design of supramolecular systems for application in photonic devices requires a clear understanding of both the mechanism of energy and electron transfer processes and how these processes can be manipulated. Central to achieving these goals is a detailed picture of their electronic structure and of the interaction between the constituent components. We review several approaches that have been taken towards gaining such understanding, with particular focus on the physical techniques employed. In the discussion, case studies are introduced to illustrate the key issues under consideration.
Comparison of structure- and ligand-based scoring functions for deep generative models: a GPCR case study.
Deep generative models have shown the ability to devise both valid and novel chemistry, which could significantly accelerate the identification of bioactive compounds. Many current models, however, use molecular descriptors or ligand-based predictive methods to guide molecule generation towards a desirable property space. This restricts their application to relatively data-rich targets, neglecting those where too little data is available to train a predictor. Moreover, ligand-based approaches often bias molecule generation towards previously established chemical space, thereby limiting their ability to identify truly novel chemotypes. In this work, we assess the use of molecular docking via Glide, a structure-based approach, as a scoring function to guide the deep generative model REINVENT, and compare model performance and behaviour to a ligand-based scoring function. Additionally, we modify the previously published MOSES benchmarking dataset to remove any induced bias towards non-protonatable groups. We also propose a new metric to measure dataset diversity, which is less confounded by the distribution of heavy atom count than the commonly used internal diversity metric. With respect to the main findings, we found that when optimizing the docking score against DRD2, the model improves predicted ligand affinity beyond that of known DRD2 active molecules. In addition, generated molecules occupy complementary chemical and physicochemical space compared to the ligand-based approach, and novel physicochemical space compared to known DRD2 active molecules. Furthermore, the structure-based approach learns to generate molecules that satisfy crucial residue interactions, which is information only available when taking protein structure into account. 
Overall, this work demonstrates the advantage of using molecular docking to guide de novo molecule generation over ligand-based predictors with respect to predicted affinity, novelty, and the ability to identify key interactions between ligand and protein target. Practically, this approach has applications in early hit generation campaigns to enrich a virtual library towards a particular target, and also in novelty-focused projects, where de novo molecule generation either has no prior ligand knowledge available or should not be biased by it.
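For orientation, the commonly used internal diversity metric that the abstract contrasts with its new proposal can be sketched as one minus the mean pairwise Tanimoto similarity. The sketch below is an assumption-laden illustration: it averages over distinct pairs only, represents fingerprints as plain Python sets of hypothetical feature identifiers, and does not implement the new metric proposed in the paper, which the abstract does not define.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature identifiers."""
    if not a and not b:
        return 1.0   # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def internal_diversity(fingerprints):
    """1 - mean pairwise Tanimoto similarity over all distinct pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    return 1.0 - sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical fingerprints: each set holds made-up feature identifiers.
fps = [{1, 2, 3}, {2, 3, 4}, {7, 8, 9}]
print(round(internal_diversity(fps), 4))
```

Larger molecules tend to set more fingerprint features, which is one way a similarity-based score can become entangled with heavy atom count.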
Investigating the influence of the sulfur oxidation state on solid state conformation
The design, synthesis and structural characterization of a series of diphenylacetylene derivatives bearing organosulfur, amide and amine moieties have been achieved, in which the molecular conformation is controlled by varying the hydrogen bond properties through alteration of the oxidation level of sulfur.